Derivative pricing using neural networks#
Over the past few years, advancements in artificial intelligence and machine learning have naturally led to the adoption of these techniques in many other fields. Neural networks, for example, have become a highly flexible tool capable of solving a wide variety of problems such as image classification, speech recognition, and language translation.
Today, we also see their application in various industries, including finance. In this notebook, we demonstrate how these models can be used—for instance, to price derivatives. Although derivative pricing is a relatively simple and well-studied problem, it provides an interesting example to illustrate how neural networks can solve financial problems, not only in pricing but also in inverse problems (e.g., model calibration) and risk management.
In this entry, we focus on a particular technique: Physics-Informed Neural Networks (PINNs). As the name suggests, these neural networks are designed to incorporate additional information derived from physical laws or known mathematical relationships. Typically, this is achieved by embedding these laws into the loss function to guide the network toward a solution that satisfies the given differential equation.
What are Physics-Informed Neural Networks (PINNs)?#
First, we begin with some definitions. Suppose we wish to solve a partial differential equation (PDE) of the form \( \mathcal{N}[V(\mathbf{x})] = 0,\quad \mathbf{x} \in \Omega, \) where:
\(\mathcal{N}\) is a differential operator,
\(V(\mathbf{x})\) is the unknown solution,
\(\Omega\) is the spatial (and possibly temporal) domain.
In a PINN, we approximate \(V(\mathbf{x})\) with a neural network \(V_\theta(\mathbf{x})\) parameterized by \(\theta\). The loss function is constructed to penalize deviations from both the governing PDE in the domain and the prescribed boundary conditions on \(\partial \Omega\). Formally, the loss can be written as:

\[ \mathcal{L}(\theta) = \mathcal{L}_{\mathrm{PDE}}(\theta) + \mathcal{L}_{\mathrm{BC}}(\theta), \]

with

\[ \mathcal{L}_{\mathrm{PDE}}(\theta) = \frac{1}{N_r} \sum_{i=1}^{N_r} \left| \mathcal{N}\big[V_\theta(\mathbf{x}_r^i)\big] \right|^2 \]

and

\[ \mathcal{L}_{\mathrm{BC}}(\theta) = \frac{1}{N_b} \sum_{j=1}^{N_b} \left| V_\theta(\mathbf{x}_b^j) - g(\mathbf{x}_b^j) \right|^2, \]

where:
\(\{\mathbf{x}_r^i\}_{i=1}^{N_r}\) are the collocation points in the interior of the domain \(\Omega\),
\(\{\mathbf{x}_b^j\}_{j=1}^{N_b}\) are the points on the boundary \(\partial \Omega\) with known values \(g(\mathbf{x}_b^j)\).
In order to train the network, we minimize the loss function using gradient-based optimization techniques. The resulting neural network will approximate the solution to the PDE in the domain \(\Omega\) and satisfy the boundary conditions on \(\partial \Omega\).
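For concreteness, here is a minimal sketch of what \(V_\theta\) might look like in PyTorch. The architecture below (a small tanh MLP taking \((t, S)\) as input) is an illustrative choice on our part, not something prescribed by the method; the smooth activation matters because the loss involves second derivatives of the network.

```python
import torch
import torch.nn as nn

class PINN(nn.Module):
    """Small fully connected network approximating V(t, S).

    A tanh MLP is a common (but not mandatory) choice for PINNs:
    smooth activations give well-behaved higher-order derivatives,
    whereas e.g. ReLU would make the second derivative vanish.
    """

    def __init__(self, hidden=64, layers=4):
        super().__init__()
        blocks = [nn.Linear(2, hidden), nn.Tanh()]
        for _ in range(layers - 1):
            blocks += [nn.Linear(hidden, hidden), nn.Tanh()]
        blocks.append(nn.Linear(hidden, 1))
        self.net = nn.Sequential(*blocks)

    def forward(self, x):
        # x has columns (t, S); output is the approximated price V
        return self.net(x)

model = PINN()
print(model(torch.rand(8, 2)).shape)  # torch.Size([8, 1])
```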
The idea is not to overcomplicate things, so let us move straight to the example. First, we import the necessary libraries:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from torch.autograd import grad
from scipy.stats import norm
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
np.random.seed(42)
torch.manual_seed(42)
Loss functions and the governing equation#
For the problem of pricing options, the fundamental PDE we need to solve is the Black-Scholes equation. This equation is a parabolic PDE that describes the evolution of the price \(V(S,t)\) of an option in terms of time \(t\) and the underlying asset price \(S\). It is given by:

\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0, \]
where:
\(V(S,t)\) is the option price,
\(S\) is the price of the underlying asset,
\(r\) is the risk-free interest rate,
\(\sigma\) is the volatility of the underlying asset.
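As an aside, for a European call this PDE admits a closed-form solution, which gives us a ground truth to benchmark the network against. A standard implementation (the function name and signature are ours):

```python
import numpy as np
from scipy.stats import norm

def bs_call_price(S, K, T, r, sigma):
    """Closed-form Black-Scholes price of a European call option."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# At-the-money call with the parameters used later in this notebook
price = bs_call_price(S=15.0, K=15.0, T=1.0, r=0.04, sigma=0.25)
print(price)
```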
Our objective is to use this PDE to guide the training of our neural network. Modern machine learning libraries, such as PyTorch, offer automatic differentiation capabilities. For example, torch.autograd.grad efficiently computes the necessary derivatives (\(\partial V/\partial t\), \(\partial V/\partial S\), and \(\partial^2 V/\partial S^2\)) at each training point. In our formulation, \(\mathcal{L}_{\mathrm{PDE}}(\theta)\) is implemented by the following function:
def pde_loss_f(model, inputs, sigma, r):
    inputs.requires_grad_(True)
    V = model(inputs)
    # First-order derivatives with respect to (t, S)
    gradients = grad(V, inputs, grad_outputs=torch.ones_like(V),
                     create_graph=True)[0]
    dVdt = gradients[:, 0]
    dVdS = gradients[:, 1]
    # Second-order derivative with respect to S
    d2VdS2 = grad(dVdS, inputs, grad_outputs=torch.ones_like(dVdS),
                  create_graph=True)[0][:, 1]
    S = inputs[:, 1]
    V = V[:, 0]
    # Black-Scholes PDE residual and its mean-squared penalty
    pde_residual = dVdt + 0.5 * sigma ** 2 * S ** 2 * d2VdS2 + r * S * dVdS - r * V
    loss_pde = torch.mean(pde_residual ** 2)
    return loss_pde
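As a sanity check on this residual computation (our addition, not part of the original derivation), we can feed it a function known to solve the PDE exactly. For example, \(V(S,t) = S^2 e^{-(\sigma^2 + r)t}\) satisfies the Black-Scholes equation term by term, so the autograd residual should vanish up to floating-point error:

```python
import torch
from torch.autograd import grad

sigma, r = 0.25, 0.04

# V(S, t) = S^2 * exp(-(sigma^2 + r) * t) solves the Black-Scholes PDE:
# substituting gives S^2 f(t) * (-(sigma^2 + r) + sigma^2 + 2r - r) = 0.
inputs = torch.rand(64, 2, dtype=torch.float64, requires_grad=True)
t, S = inputs[:, 0], inputs[:, 1]
V = S**2 * torch.exp(-(sigma**2 + r) * t)

# Same derivative mechanics as pde_loss_f above
g = grad(V, inputs, grad_outputs=torch.ones_like(V), create_graph=True)[0]
dVdt, dVdS = g[:, 0], g[:, 1]
d2VdS2 = grad(dVdS, inputs, grad_outputs=torch.ones_like(dVdS),
              create_graph=True)[0][:, 1]

residual = dVdt + 0.5 * sigma**2 * S**2 * d2VdS2 + r * S * dVdS - r * V
mse = torch.mean(residual**2)
print(mse.item())  # effectively zero (floating-point noise)
```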
Additionally, to incorporate the boundary conditions, we define an additional loss function that forces the network to satisfy these constraints. Since the boundary conditions we are using are of Dirichlet type, the loss function is relatively simple (it merely compares the network outputs against the known boundary values). However, nothing prevents us from employing other types of boundary conditions (such as Neumann or Robin).
def boundary_loss_f(model, inputs, outputs):
    V_pred = model(inputs)
    boundary_loss = torch.mean((V_pred - outputs) ** 2)
    return boundary_loss
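Putting the two terms together, a single optimization step could look like the following compressed sketch. The toy model, point counts, and learning rate are illustrative choices of ours, and the random interior and terminal points merely stand in for the proper collocation sampling discussed below.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import grad

torch.manual_seed(0)
sigma, r, K, T = 0.25, 0.04, 15.0, 1.0

# Toy stand-in for the pricing network: small tanh MLP on (t, S)
model = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
opt = optim.Adam(model.parameters(), lr=1e-3)

# Random interior points in [0, T] x [0, 4K] and terminal points at t = T
interior = torch.rand(256, 2) * torch.tensor([T, 4 * K])
terminal = torch.rand(256, 2) * torch.tensor([0.0, 4 * K]) + torch.tensor([T, 0.0])
payoff = torch.clamp(terminal[:, 1:2] - K, min=0.0)

for step in range(3):  # a few illustrative steps, not a full training run
    opt.zero_grad()
    x = interior.clone().requires_grad_(True)
    V = model(x)
    g = grad(V, x, grad_outputs=torch.ones_like(V), create_graph=True)[0]
    dVdt, dVdS = g[:, 0], g[:, 1]
    d2VdS2 = grad(dVdS, x, grad_outputs=torch.ones_like(dVdS),
                  create_graph=True)[0][:, 1]
    S = x[:, 1]
    res = dVdt + 0.5 * sigma**2 * S**2 * d2VdS2 + r * S * dVdS - r * V[:, 0]
    # Total loss: PDE residual plus terminal (payoff) condition
    loss = torch.mean(res**2) + torch.mean((model(terminal) - payoff) ** 2)
    loss.backward()
    opt.step()
```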
Collocation points#
To solve the PDE using PINNs, we need to define the numerical domain over which the equation will be solved. Much like traditional numerical methods (e.g., finite differences), we select a set of points, called collocation points, where the PDE is enforced. These include points in the interior of the domain \(\Omega\) as well as on its boundary \(\partial\Omega\).
To enforce the boundary conditions, additional loss terms are added that force the network to satisfy them. For a European call option, the typical boundary conditions are:
At \(S = 0\): \( V(0, t) = 0, \) since if the underlying asset’s price is zero, the option is worthless.
For \(S \to \infty\): \( V(S, t) \approx S - K e^{-r(T-t)}, \) as the option behaves like a linear payoff for large \(S\).
At expiration \(t = T\): \( V(S, T) = \max(S - K,\, 0), \) which is simply the payoff of the option.
We sample these points using random numbers and use them to train the neural network. Below is an implementation for sampling them:
def payoff(s, k):
    return np.maximum(s - k, 0)


def interior_samples(option_config, scaling=4):
    t = np.random.uniform(
        0, option_config['maturity'], option_config['n_samples'])
    s = np.random.uniform(
        0, scaling * option_config['strike'], option_config['n_samples'])
    tag = np.zeros(option_config['n_samples'])
    output = np.zeros(option_config['n_samples'])
    return t, s, tag, output


def top_boundary_samples(option_config, scaling=4):
    t = np.random.uniform(
        0, option_config['maturity'], option_config['n_samples'])
    s = np.ones(option_config['n_samples']) * scaling * option_config['strike']
    strike = option_config['strike']
    s_max = scaling * option_config['strike']
    r = option_config['r']
    T = option_config['maturity']
    output = s_max - strike * np.exp(-r * (T - t))
    tag = np.ones(option_config['n_samples'])
    return t, s, tag, output


def bottom_boundary_samples(option_config):
    t = np.random.uniform(
        0, option_config['maturity'], option_config['n_samples'])
    s = np.zeros(option_config['n_samples'])
    output = np.zeros(option_config['n_samples'])
    tag = np.ones(option_config['n_samples'])
    return t, s, tag, output


def initial_condition_samples(option_config, scaling=4):
    t = np.ones(option_config['n_samples']) * option_config['maturity']
    s = np.random.uniform(
        0, scaling * option_config['strike'], option_config['n_samples'])
    tag = np.ones(option_config['n_samples'])
    output = payoff(s, option_config['strike'])
    return t, s, tag, output
Important
The choice of collocation points is critical for the convergence of the method. One potential improvement is to use quasi-random numbers to distribute the collocation points more uniformly across the domain, thereby avoiding the clustering that pseudo-random sampling can produce.
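As an illustration, SciPy’s quasi-Monte Carlo module offers scrambled Sobol’ sequences that could replace the `np.random.uniform` draws in the samplers above. This sketch assumes the same \([0, T] \times [0, 4K]\) domain with \(T = 1\) and \(K = 15\):

```python
import numpy as np
from scipy.stats import qmc

# Sobol' points cover the unit square more evenly than pseudo-random draws
sampler = qmc.Sobol(d=2, scramble=True, seed=42)
unit = sampler.random_base2(m=10)  # 2**10 = 1024 points in [0, 1)^2

# Rescale to the pricing domain: t in [0, 1], S in [0, 60]
pts = qmc.scale(unit, l_bounds=[0.0, 0.0], u_bounds=[1.0, 60.0])
t, s = pts[:, 0], pts[:, 1]
print(pts.shape)  # (1024, 2)
```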
Of course, we need to define the financial parameters that will be used in the problem. For this example, we will use the following values:
config = {'strike': 15, 'maturity': 1,
          'r': 0.04, 'sigma': 0.25, 'n_samples': 10000}
# Generate samples
inner_samples = interior_samples(config)
top_samples = top_boundary_samples(config)
bottom_samples = bottom_boundary_samples(config)
initial_samples = initial_condition_samples(config)
Visually, this is what our domain looks like: